The necessity of the FORM project is discussed. Then the evolving needs of particle physics are
considered by looking at the trends over the years, and a guess is made at what will be needed in the
(near) future. The paper concludes with some critical remarks concerning the publication of results and programs.



1. Why FORM? 

In particle theory we have categories of calcu- 
lations that are particularly demanding on hard- 
ware and software facilities. So much so that par- 
ticle theory has stood at the cradle of symbolic 
computation and also afterwards has made large 
contributions to it. Yet, as soon as a system be- 
comes bigger and bigger and commercially inter- 
esting, it often leaves its origins and it becomes 
more and more difficult to influence its develop- 
ment. 

Hence we need one or more systems whose development
we can influence, so that they can be optimized, or
nearly optimized, for our needs. The best case is that
the author(s) are themselves involved in our type of
calculations. In the next best case we should be in a
position to adapt a system ourselves, to avoid a very
lengthy cycle of interaction with the authors. This
calls for an open source system that is documented well
enough to make additions as easy as possible. Moreover,
the system should be readily available to all researchers.

FORM [1,2,3] is supposed to fit these requirements.
To some extent it already does, because I have been
involved in many of the types of projects that other
people in particle phenomenology are also engaged in.
Like GiNaC [4] it is an 'in house' particle theory
project with applications to other fields of science.
The fact that FORM isn't open source yet is being
worked on.

One should also realize that FORM is heavily
optimized for speed and for the handling of very large
expressions. Commercial systems usually have a
different optimization target: the overwhelming
majority of commercial users do not have problems
that explore the limits of what is possible.

2. Trends in Loops and Legs 

If one looks at the history of calculations in 
particle theory one sees a development over the 
years. 

At first symbolic manipulation meant combining
tensors and four-vector dot products and manipulating
gamma matrices. This was what Schoonschip [5] was
designed for, and it was also one of the first things
that FORM could do. Because there are still people who
think in these terms (nowadays mainly people who are
not in particle phenomenology), FORM has been
stigmatized as a program that is only suitable for
particle physics.

An example of a reaction that was topline research
in 1976 [6]: $\gamma p \to \tau^- \tau^+ X \to e^- (\nu\bar\nu)\, \mu^+ (\nu\bar\nu)\, X$.

    *
    *   gamma+proton -> tau- tau+ X -> e- (nu nubar)
    *                       mu+ (nu nubar) X
    *   Narrow width approximation and full
    *   spin-spin correlations.
    *
    S   mtau,mmu,me,mnut,mnum,mnue;
    I   j1,j2,j3,j4,e1;
    V   pa,pb,q1,q2,e2,p1,p2,p3,p4,p5,p6,pe,pm;
    L   F =
        (g_(1,p1)+mnut*gi_(1))*
        (g_(2,pe)+me*gi_(2))*
        (g_(3,p4)+mnum*gi_(3))*(
            +g_(1,j1)*g7_(1)*(g_(1,q1)+mtau*gi_(1))
                *g_(1,e1)*(g_(1,q1)-g_(1,pa)
                    +mtau*gi_(1))*g_(1,e2)
                *(-g_(1,q2)+mtau*gi_(1))
                *g_(1,j2)*g7_(1)*(-1/2)/q1.pa
                *g_(2,j1)*g7_(2)*g_(3,j2)*g7_(3)
            +g_(1,j1)*g7_(1)*(g_(1,q1)+mtau*gi_(1))
                *g_(1,e2)*(g_(1,pa)-g_(1,q2)
                    +mtau*gi_(1))*g_(1,e1)
                *(-g_(1,q2)+mtau*gi_(1))
                *g_(1,j2)*g7_(1)*(-1/2)/q2.pa
                *g_(2,j1)*g7_(2)*g_(3,j2)*g7_(3)
        )*
        (g_(1,p6)-mnut*gi_(1))*
        (g_(2,p3)-mnue*gi_(2))*
        (g_(3,pm)-mmu*gi_(3))*(
            +g_(1,j4)*g7_(1)*(-g_(1,q2)+mtau*gi_(1))
                *g_(1,e2)*(g_(1,q1)-g_(1,pa)
                    +mtau*gi_(1))*g_(1,e1)
                *(g_(1,q1)+mtau*gi_(1))
                *g_(1,j3)*g7_(1)*(-1/2)/q1.pa
                *g_(2,j3)*g7_(2)*g_(3,j4)*g7_(3)
            +g_(1,j4)*g7_(1)*(-g_(1,q2)+mtau*gi_(1))
                *g_(1,e1)*(g_(1,pa)-g_(1,q2)
                    +mtau*gi_(1))*g_(1,e2)
                *(g_(1,q1)+mtau*gi_(1))
                *g_(1,j3)*g7_(1)*(-1/2)/q2.pa
                *g_(2,j3)*g7_(2)*g_(3,j4)*g7_(3)
        )/2^16;
    Trace4,3;
    Trace4,2;
    Trace4,1;
    id  q1.q1 = mtau^2;
    id  q2.q2 = mtau^2;
    id  pa.pa = 0;
    Print +f +s;
    .end

    Time =       0.21 sec    Generated terms =       1992
                   F         Terms in output =        176
                             Bytes used      =       8552

Next came the manipulation of loop integrals. 
At first the one loop integrals and their reduction 
to scalar loop integrals. Here is a very advanced 
example of a Feynman diagram calculated around 
the year 1983 [7]: 



Then (1989-2000) [8,9,10,11,12] came the
manipulation of the three loop propagator graphs, done
by means of many relations based on integration by
parts [13]. The expansions in terms of $\epsilon$ posed
additional requirements. These were the days of
version 2 of FORM.

In the late nineties new trends were emerging.
Not only were new techniques developed for rewriting
the diagrams in terms of master integrals [14,15,16],
but new methods for the treatment of the integrals
themselves also saw the light of day, most notably
methods with nested sums [17] and harmonic
polylogarithms [18]. The rewriting started to require
the solving of large sets of equations. Additionally,
the occasional calculation of more and more complicated
color factors [19] required some additional types of
topological pattern matching. These last methods have
not yet been explored very much, as they will be needed
mostly when there are very many loops, but they can
also be very useful for complicated tensor algebra. The
use of very large tables also inspired new and original
solutions. Version 3 of FORM has these methods in mind.

By now the traces of the gamma matrices form 
just a very small corner in the space of its capa- 
bilities. 

Several types of calculations will always be at
the limits of what is possible. A good example is the
project at Karlsruhe of Baikov, Chetyrkin and Kühn [20].
Given more power, they can do deeper calculations.
Hence it is important to have FORM as powerful as
possible. This is addressed by the ParFORM [2] project,
and more recently this has led to TFORM [3].

Another example of something that is in principle
open ended is the expression of multiple zeta values in
terms of a minimal set of variables [21]. This is the
status as of 2007. Time is real time on a computer with
8 Xeon cores at 2.3 GHz and 2 Gbytes of memory per core,
running TFORM.



The recent developments in massless multi-particle
one loop amplitudes [22] have not yet led to particular
symbolic projects, and it is not clear whether any are
needed. The methods with sector decomposition [23] are
under development and, again, it isn't 100% clear
whether they need new developments in the symbolic
sector. Possibly internal capabilities for treating
combinations of sums, theta functions, delta functions
and the splitting of factors with denominators could
speed up the nested sums considerably. This would be
very useful for the current methods based on the
Mellin-Barnes approach [24]. Building this in is,
however, not entirely trivial, unless it is made so
specific that it serves only a single problem. The
automatic calculations as in GRACE [25] and CompHEP [26]
can definitely use some new facilities in the field of
code simplification.

I am probably forgetting a few things here. 

3. The current status 

What is the status of the FORM project? 

First, manpower. At the moment I myself work almost
full time on FORM. In addition Misha Tentyukov is
involved in the ParFORM project; he has also made other
additions in the past [27]. At the moment Jens Vollinga
holds a three year postdoc position at Nikhef which
also involves work for FORM. He has already written the
code for system-independent .sav files and is working
on a failsafe system that (at some cost of course)
allows one to set up checkpoints from which one can
restart after a computer failure. He is also setting up
a framework for documentation and providing better
installation using the 'make install' approach. As part
of a FOM project grant we have the money to hire a
programmer for 18 months (starting in the autumn) to
help with the project of code simplification. This is
to facilitate the FORM version of GRACE [28].

Over the past few years TFORM [3] has been developed
as a complement to ParFORM [2]. Each of the two has its
restrictions. ParFORM can operate on clusters, whereas
TFORM works only on multi core systems with shared
memory. Because of the shared memory some things are
much easier in TFORM: much administrative work needs
only a single copy, and much multiple reading of files
can be done with a relatively simple locking system.
ParFORM has the advantage that clusters can have many
more processors, but the communication is much more
complicated. Optimization of the programs is a field of
research and may need some extra manpower. The ideal
would be a system that can use clusters of multi core
machines. The problem is to reduce the bottlenecks so
that for N processors the execution time comes as close
as possible to 1/N times the time needed on a single
processor. Currently it is rather hard to get the
execution time below 1/5 of the single processor time
on 8 processors and below 1/10 on 32 processors. A
careful study and inventive solutions will be needed.
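
To put these timing fractions in perspective, the following minimal Python sketch converts them into a speedup and a parallel efficiency. It also estimates the non-parallel fraction of the work, under the assumption that Amdahl's law is an adequate model. That model and all function names are illustrative assumptions, not something taken from the FORM sources.

```python
def speedup(time_fraction):
    """Speedup when the parallel run takes `time_fraction` of the serial time."""
    return 1.0 / time_fraction


def efficiency(time_fraction, processors):
    """Fraction of the ideal N-fold speedup that is actually reached."""
    return speedup(time_fraction) / processors


def amdahl_serial_fraction(time_fraction, processors):
    """Serial fraction s implied by Amdahl's law (an assumed model).

    T_N / T_1 = s + (1 - s) / N   =>   s = (T_N/T_1 - 1/N) / (1 - 1/N)
    """
    inv_n = 1.0 / processors
    return (time_fraction - inv_n) / (1.0 - inv_n)


# The two data points quoted in the text: 1/5 on 8 and 1/10 on 32 processors.
for n, frac in ((8, 1 / 5), (32, 1 / 10)):
    print(n, speedup(frac), efficiency(frac, n), amdahl_serial_fraction(frac, n))
```

Under this simple model the quoted numbers correspond to an effectively serial share of between 7 and 9 percent of the work, which gives a rough idea of how much the bottlenecks would have to shrink.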

Recently we have started to make use of the GMP
(GNU Multiple Precision) [29] library for some of the
computations with large integers. This is because the
integers and fractions that occur have become larger
and larger, often way beyond what was originally
envisioned. We use only the low level routines for
multiplication, division and GCD calculations. The gain
in speed isn't impressive though, because the
algorithms inside FORM are rather decent (especially
after the improvements found at the end of 2006). Still,
GMP can do some things more efficiently, because it has
some assembler routines and in assembler one can do a
number of things far more efficiently than in C. The
need to convert from FORM notation to GMP notation
introduces an overhead. We still need to experiment to
find the optimal size below which we should use the
original routines and above which we should use the GMP
library. If one does calculations that involve
fractions with very large integers (like hundreds of
digits) one will find that the more recent versions of
FORM (2007 and later) are noticeably faster.
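
The size-threshold idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's built-in integers stand in for the GMP routines, the limb lists stand in for FORM's internal notation (whose real layout is not described here), and `THRESHOLD_LIMBS` is a hypothetical value that would have to be tuned by experiment.

```python
BASE = 2 ** 32          # one limb holds 32 bits (illustrative choice)
THRESHOLD_LIMBS = 8     # hypothetical crossover, to be found by experiment


def to_limbs(n):
    """Non-negative int -> little-endian list of base-2**32 limbs."""
    limbs = []
    while True:
        limbs.append(n % BASE)
        n //= BASE
        if n == 0:
            return limbs


def from_limbs(limbs):
    """Little-endian limb list -> int."""
    value = 0
    for limb in reversed(limbs):
        value = value * BASE + limb
    return value


def native_mul(a, b):
    """Schoolbook multiplication working directly on the limb lists."""
    result = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            total = result[i + j] + ai * bj + carry
            result[i + j] = total % BASE
            carry = total // BASE
        result[i + len(b)] += carry
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result


def library_mul(a, b):
    """Convert, multiply in the big-integer backend, convert back.

    The two conversions model the notation overhead mentioned in the text.
    """
    return to_limbs(from_limbs(a) * from_limbs(b))


def multiply(a, b):
    """Use the native routine for small operands, the library for large ones."""
    if max(len(a), len(b)) < THRESHOLD_LIMBS:
        return native_mul(a, b)
    return library_mul(a, b)
```

Timing `native_mul` against `library_mul` for a range of operand sizes is one way to locate the crossover on a given machine.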

4. What to expect and hope for 

The systems of equations that need to be solved
often call for capabilities with rational polynomials.
This is particularly the case with the Laporta
algorithm [16]. FORM doesn't currently have such
capabilities, hence building them in, and doing so
efficiently, has rather high priority. There exist
publicly available libraries for the manipulation of
polynomials in a single variable, some of them claiming
great efficiency, but there are no equivalent libraries
for polynomials in many variables. In addition there is
the problem of notation: too much time spent on
conversion will not be beneficial. Currently the
problem is under study. Most univariate algorithms (in
particular the GCD) have been implemented by various
methods, and this is by now reasonably fast.
Factorization is less urgent, but can come in handy
when constructing a system for simplification.

It is also important to deal with multivariate
rational polynomials efficiently if one wants to
create a system for computing Gröbner bases. There are
however several ways to deal with polynomials, and each
way needs its own solution:

• Small polynomials: when they take a small 
amount of space they can be kept inside the 
argument of a function. There may be bil- 
lions of such polynomials. They should be 
treated inside the regular workspace. Uni- 
variate polynomials will usually be in this 
category. An improvement in efficiency will 
be to tabulate a number of them. This is 
especially the case for factorization which is 
relatively expensive. 

• Intermediate polynomials: these could be 
handled by means of memory allocations as 
is done with the dollar variables. One could 
have hundreds or even thousands of them. 
Typically not billions. 



• Large polynomials: These are complete ex- 
pressions that could have billions of terms. 
Calculating their GCD would have to use 
the same mechanisms as by which expres- 
sions are treated. There should be only very 
few of these. 

An example of something that already works:
PolyRatFun is an experimental statement that is
similar to the PolyFun statement, but now the
function needs two arguments: a numerator and
a denominator.

    Symbols x,y;
    CFunction pacc;
    PolyRatFun pacc;
    L   F = pacc(x^2+x-3,(x+1)*(x+2))*y
           +pacc(x^2+3*x+1,(x+3)*(x+2))*y^2;
    Print +s;
    .sort

       F =
           + y*pacc(x^2 + x - 3,x^2 + 3*x + 2)
           + y^2*pacc(x^2 + 3*x + 1,x^2 + 5*x + 6)

    id  y = 1;
    Print;
    .end

       F =
          pacc(2*x^2 + 4*x - 4,x^2 + 4*x + 3);
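
The arithmetic behind the last step (adding two rational functions and cancelling the common factor by a GCD) can be checked independently. The Python sketch below does this for univariate polynomials with exact rational coefficients. It is not FORM's implementation: the helper names and the plain Euclidean GCD are assumptions made for illustration, and coefficient lists are ordered from the constant term upward.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lowest degree comes first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def poly_add(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])


def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def poly_divmod(a, b):
    """Quotient and remainder of a / b over the rationals (b must be nonzero)."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    quotient = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and a != [0]:
        shift = len(a) - len(b)
        factor = a[-1] / b[-1]
        quotient[shift] = factor
        for i, c in enumerate(b):
            a[i + shift] -= factor * c
        a = trim(a)
    return trim(quotient), a


def poly_gcd(a, b):
    """Monic greatest common divisor via the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b != [0]:
        a, b = b, poly_divmod(a, b)[1]
    lead = Fraction(a[-1])
    return [Fraction(c) / lead for c in a]


def ratfun_add(n1, d1, n2, d2):
    """Return n1/d1 + n2/d2 in lowest terms with a monic denominator."""
    num = poly_add(poly_mul(n1, d2), poly_mul(n2, d1))
    den = poly_mul(d1, d2)
    g = poly_gcd(num, den)
    num, den = poly_divmod(num, g)[0], poly_divmod(den, g)[0]
    lead = den[-1]
    return [c / lead for c in num], [c / lead for c in den]


# The two pacc terms of the example with y = 1.
d1 = poly_mul([1, 1], [2, 1])      # (x+1)*(x+2)
d2 = poly_mul([3, 1], [2, 1])      # (x+3)*(x+2)
num, den = ratfun_add([-3, 1, 1], d1, [1, 3, 1], d2)
```

Here `num` and `den` hold the coefficients of 2*x^2 + 4*x - 4 and x^2 + 4*x + 3, in agreement with the FORM output above.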

Sometimes one would like to make quick private
additions for things that are extremely hard to program
at the FORM level. Such things are often either of a
combinatoric nature or involve special patterns. It is
of course impossible to foresee what people will need.
Hence FORM should be structured in such a way that it
is possible to make such additions, even though this
won't be for beginners. The first requirement for this
is good documentation of the inner workings, including
a number of examples. The second requirement is code
that can be understood and is structured properly. It
is because these two requirements are not yet met that
FORM hasn't been released as open source. We hope to
have reached that point in about two years' time.

As mentioned before, we would like to have a way to
introduce code simplification. This would be relevant
for all outputs that need further numerical evaluation
in the languages Fortran and C. If possible we would
like to extend this to the regular output as far as
factorization is concerned. Some things can already be
done at the FORM level, but this is usually rather slow.
One can for instance make a procedure 'tryfactor'
which would work like

    #do i = -100,100
    #call tryfactor(acc,x+`i')
    #enddo
    B   acc;
    Print;

and the answer might be like

    +acc(x-27)*acc(x+6)*acc(x+67)*( ... )

Because this is very slow and requires guessing 
the factors, this is far from ideal. 
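
The guessing strategy itself is easy to mimic outside FORM. The Python sketch below tries every linear factor x+i with i from -100 to 100 on a univariate polynomial with integer coefficients, using synthetic division. It illustrates the brute-force idea only: the body of 'tryfactor' is not given in the text, so the function names and the coefficient representation (highest degree first) are assumptions.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def synthetic_divide(coeffs, root):
    """Divide by (x - root); return (quotient, remainder)."""
    out = []
    acc = 0
    for c in coeffs:
        acc = acc * root + c
        out.append(acc)
    return out[:-1], out[-1]


def try_linear_factors(coeffs, lo=-100, hi=100):
    """Split off every factor (x + i) with lo <= i <= hi, multiplicities included.

    Returns the list of shifts i that were found and the unfactored rest.
    """
    factors = []
    rest = list(coeffs)
    for i in range(lo, hi + 1):
        # x + i divides the polynomial exactly when it vanishes at x = -i.
        while len(rest) > 1:
            quotient, remainder = synthetic_divide(rest, -i)
            if remainder != 0:
                break
            factors.append(i)
            rest = quotient
    return factors, rest


# Hypothetical test polynomial: (x-27)*(x+6)*(x+67)*(x^2+1).
poly = poly_mul(poly_mul([1, -27], [1, 6]), poly_mul([1, 67], [1, 0, 1]))
factors, rest = try_linear_factors(poly)
```

Each candidate costs a full pass over the polynomial, and nothing is found for factors outside the guessed range or of higher degree, which is the same weakness the FORM level procedure has.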

Something that the community should think about:
at a given moment Nikhef and/or I may no longer be able
to take the responsibility for FORM. Which institute or
individual(s) can take over this responsibility? Would
FORM disappear? It is not a good idea to depend on the
free time of some individuals. The open source project
may help, but this is probably not sufficient. There
should be a professional commitment. Of course, if
someone can come up with a better product, evolution
will take its course. But that would require a large
investment as well. Good ideas are needed here, because
it doesn't look like CERN (which would be the most
natural choice) is volunteering.

5. Some critical remarks 

Some people prefer to use expensive commer- 
cial systems and give their results in terms of rou- 
tines for these systems. I believe this to be very 
shortsighted. 

Years ago we had a preprint system, and only 
the top universities would get the preprints and 
be up to date. Poor universities would not be 
able to be up to date and hence meaningful up 
to date research could only be done at a limited 
number of places. 



Now with the internet, everybody can be up 
to date and meaningful research can be done ev- 
erywhere. If however we present our results in 
the form of programs for very expensive software 
systems, we take a big step back. It is a form of 
elitism. 

I am not pushing here for my own program. 
What I want to say is that it is in the interest 
of science that all results are freely available and 
freely accessible, and that the threshold for using 
the results is as low as possible. 

If someone isn't happy with the facilities offered
by the free software, they should spend some effort or
resources on helping to provide such facilities. That
is something that everybody can benefit from.

Another thing (Remember Babylon): 
The situation becomes really chaotic when 
there are many complementary results from dif- 
ferent authors set up for different systems. It be- 
comes rapidly impossible to combine such results. 